Analysis of Automatic Stress Assignment in Slovene
Abstract
We tested the ability of humans and machines (data mining techniques) to assign stress to Slovene words. This is a challenging comparison for machines, since humans perform the task extremely well, even on unknown words presented without any context. We aimed to find good machine-made models for stress assignment by applying new methods and by drawing on a known theory of rules for stress assignment in Slovene. The upgraded data mining methods outperformed expert-defined rules on practically all subtasks, showing that data mining can more than compete with humans in constructing formal knowledge about stress assignment. Compared directly with humans, however, the data mining methods still failed to achieve equally good results on assigning stress to unknown words.
Similar resources
Automatic Accentuation of Words for Slovenian TTS System
Accentuating unknown Slovene words is a challenging task for automated solvers, since in Slovenian stress can fall on any syllable. Most words have only one stressed syllable, but there are also words with no stress and words with more than one stress. Furthermore, different forms of the same word can be stressed differently. In this paper, we present a two level l...
Automatic Lexical Stress Assignment of Unknown Words for Highly Inflected Slovenian Language
This paper presents a two-level lexical stress assignment model for out-of-vocabulary Slovenian words, used in our text-to-speech system. First, each vowel (and the consonant 'r') is classified as stressed or unstressed, and a type of lexical stress is assigned to every stressed vowel (and consonant 'r'). We applied a machine-learning technique (decision trees or boosted decision trees)...
Machine Learning of Morphosyntactic Structure: Lemmatizing Unknown Slovene Words
Automatic lemmatization is a core application for many language processing tasks. In inflectionally rich languages, such as Slovene, assigning the correct lemma (base form) to each word in a running text is not trivial, since for instance, nouns inflect for number and case, with a complex configuration of endings and stem modifications. The problem is especially difficult for unknown words, sin...
Learning to Lemmatise Slovene Words
Automatic lemmatisation is a core application for many language processing tasks. In inflectionally rich languages, such as Slovene, assigning the correct lemma to each word in a running text is not trivial: nouns and adjectives, for instance, inflect for number and case, with a complex configuration of endings and stem modifications. The problem is especially difficult for unknown words, as wo...
Effect of Cognitive Behavioral Therapy Based Psychoeducation Program on University Students' Automatic Thoughts, Perceived Stress and Self-Efficacy Levels
Background: University life is a special period in which students take full responsibility for their own lives, especially as individuals, and therefore includes many positive and negative situations. As a result of this situation, they need serious psychological support in order to cope with the potential or real problems they experience. The research was conducted to determine the effect of C...
Journal:
- Informatica, Lith. Acad. Sci.
Volume 20
Publication date: 2009